Abstract

We shall prove new contraction properties of general transportation costs along nonnegative
measure-valued solutions to Fokker-Planck equations in $\mathbb{R}^d$, when the drift is a monotone
(or $\lambda$-monotone) operator. A new duality approach to contraction estimates has been developed:
it relies on the Kantorovich dual formulation of optimal transportation problems and
on a variable-doubling technique. The latter is used to derive a new comparison property of
solutions of the backward Kolmogorov (or dual) equation. The advantage of this technique
is twofold: it directly applies to distributional solutions without requiring stronger regularity,
and it extends the Wasserstein theory of Fokker-Planck equations with gradient drift terms
started by Jordan-Kinderlehrer-Otto [14] to more general costs and monotone drifts,
without requiring the drift to be a gradient and without assuming any growth conditions.



1 Introduction 

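The aim of this paper is to obtain new uniqueness and contractivity results for nonnegative
measure-valued solutions to the Fokker-Planck equation

\[ \partial_t \rho - \Delta \rho - \nabla\cdot(\rho B) = 0, \qquad \rho|_{t=0} = \rho_0, \tag{1} \]

where $B : \mathbb{R}^d \to \mathbb{R}^d$ is a Borel $\lambda$-monotone operator, $\lambda \in \mathbb{R}$, i.e.

\[ \langle B(x) - B(y), x - y \rangle \ge \lambda\,|x - y|^2 \quad \text{for every } x, y \in \mathbb{R}^d. \tag{2} \]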
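To illustrate condition (2), the following minimal sketch checks numerically that a linear drift with a rotational part is $\lambda$-monotone although it is not a gradient. The drift, the constants `LAM` and `OMEGA`, and all function names are assumptions chosen for the example; they do not come from the paper.

```python
import random

LAM = 0.5    # assumed monotonicity constant (illustrative value)
OMEGA = 3.0  # assumed strength of the rotational, non-gradient part


def drift(x):
    """B(x) = LAM * x + OMEGA * J x on R^2, where J rotates by 90 degrees."""
    x1, x2 = x
    return (LAM * x1 - OMEGA * x2, LAM * x2 + OMEGA * x1)


def monotonicity_gap(x, y):
    """<B(x) - B(y), x - y> - LAM * |x - y|^2; condition (2) asks for >= 0."""
    bx, by = drift(x), drift(y)
    d1, d2 = x[0] - y[0], x[1] - y[1]
    inner = (bx[0] - by[0]) * d1 + (bx[1] - by[1]) * d2
    return inner - LAM * (d1 * d1 + d2 * d2)


def worst_gap(samples=1000, seed=0):
    """Smallest gap over random pairs of points in the square [-10, 10]^2."""
    rng = random.Random(seed)

    def point():
        return (rng.uniform(-10.0, 10.0), rng.uniform(-10.0, 10.0))

    return min(monotonicity_gap(point(), point()) for _ in range(samples))
```

For this particular drift the gap vanishes up to rounding, because the skew-symmetric part contributes nothing to the scalar product; the Jacobian of the drift is not symmetric, so the drift cannot be written as $\nabla V$.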

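Here we consider a weakly continuous family of probability measures
$(\rho_t)_{t \ge 0} \subset \mathscr{P}(\mathbb{R}^d)$ satisfying equation (1) in the sense of distributions

\[ \int_0^{+\infty}\!\!\int_{\mathbb{R}^d} \big(\partial_t \zeta + \Delta \zeta - B \cdot \nabla \zeta\big)\, d\rho_t\, dt = 0 \quad \forall\, \zeta \in C_c^\infty(\mathbb{R}^d \times (0,+\infty)), \tag{3} \]

with the initial datum $\rho_0$.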

Equations of this type are the subject of several papers by Bogachev, Da Prato, Krylov,
Röckner, and Stannat, who consider a very general situation where the Laplacian is replaced
by a second order elliptic operator with variable coefficients and $B$ is locally bounded. Existence
of solutions has been proved in [6, Cor. 3.3], uniqueness has been considered in [5] under general
growth-coercivity conditions on $B$, and regularity has been investigated in [7]: in particular, it has
been shown that $\rho_t$ is absolutely continuous with respect to the Lebesgue measure for $\mathscr{L}^1$-a.e. $t$.

When B is Lipschitz continuous, uniqueness can be obtained by standard duality arguments, 
see e.g. [3, Sec. 3]. Here we want to obtain a more precise stability estimate on the solutions of (1), 
only assuming monotonicity of B without any growth condition. To achieve this aim, we adopt 
the point of view of optimal transportation.

The Wasserstein approach to Fokker-Planck equation in the gradient case. When $B$
is the gradient of a $\lambda$-convex function $V : \mathbb{R}^d \to \mathbb{R}$, then (1) can be considered as the gradient flow
of the perturbed entropy functional

\[ \mathscr{H}(\rho) := \int_{\mathbb{R}^d} u(x) \log u(x)\, dx + \int_{\mathbb{R}^d} V(x)\, d\rho(x), \qquad \rho = u\,\mathscr{L}^d, \tag{4} \]

in the space $\mathscr{P}_2(\mathbb{R}^d)$ of probability measures with finite quadratic moments, endowed with the so-called
$L^2$-Kantorovich-Rubinstein-Wasserstein distance $W_2(\cdot,\cdot)$. This distance can be defined by

\[ W_2^2(\rho^1,\rho^2) := \min\Big\{ \int_{\mathbb{R}^d \times \mathbb{R}^d} |x_1 - x_2|^2\, d\boldsymbol{\rho}(x_1,x_2) \;:\; \boldsymbol{\rho} \in \mathscr{P}(\mathbb{R}^d \times \mathbb{R}^d),\ \boldsymbol{\rho} \text{ is a coupling between } \rho^1 \text{ and } \rho^2 \Big\} \tag{5} \]

in terms of couplings, i.e. measures $\boldsymbol{\rho}$ in the product space $\mathbb{R}^d \times \mathbb{R}^d$ whose marginals are $\rho^1$ and $\rho^2$
respectively, so that $\boldsymbol{\rho}(E \times \mathbb{R}^d) = \rho^1(E)$ and $\boldsymbol{\rho}(\mathbb{R}^d \times E) = \rho^2(E)$ for every Borel subset $E \subset \mathbb{R}^d$.
It is possible to prove that optimal couplings realizing the minimum in (5) always exist.
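As a concrete instance of (5), consider two uniform empirical measures on the real line with the same number of atoms. In that special case the minimum is attained by a permutation of the atoms, and the monotone (sorted) matching is optimal for the quadratic cost. The sketch below compares the sorted matching with a brute-force search over permutations; the function names are hypothetical and the brute-force search is only meant for very small samples.

```python
import itertools
import math


def w2_sorted(xs, ys):
    """W_2 between two uniform empirical measures on the line, via sorting."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("need two nonempty samples of equal size")
    pairs = zip(sorted(xs), sorted(ys))
    return math.sqrt(sum((a - b) ** 2 for a, b in pairs) / len(xs))


def w2_brute_force(xs, ys):
    """Same quantity, minimizing the quadratic cost over all permutations."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("need two nonempty samples of equal size")
    best = min(
        sum((a - b) ** 2 for a, b in zip(xs, perm))
        for perm in itertools.permutations(ys)
    )
    return math.sqrt(best / len(xs))
```

For instance, `w2_sorted([0.0, 1.0, 3.0], [5.0, 4.0, 2.0])` and `w2_brute_force` on the same data return the same value, which is what the existence of an optimal coupling predicts in this discrete setting.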

This remarkable interpretation found in [14] gave rise to a series of studies on the relationships
between certain classes of diffusion equations and distances between probability measures induced
by optimal transport problems (see e.g. the general overviews of [21, 2, 22]). One of the strengths
of this approach is a new geometric insight (developed in [16]) in the evolution process: in the case
of (1) the $\lambda$-convexity of the potential $V$ reflects a $\lambda$-convexity property (also called displacement
convexity) of the functional $\mathscr{H}$ along the geodesics of $\mathscr{P}_2(\mathbb{R}^d)$. This nice feature, discovered
by [15], suggests that one can adapt some typical basic existence, approximation, and regularity
results for gradient flows of convex functionals in Euclidean spaces or Riemannian manifolds to
the measure-theoretic setting of $\mathscr{P}_2(\mathbb{R}^d)$. This program has been carried out (see e.g. [2]) and,
among the most interesting estimates, it provides the $\lambda$-contraction property

\[ W_2(\rho^1_t, \rho^2_t) \le e^{-\lambda t}\, W_2(\rho^1_0, \rho^2_0) \quad \text{for every } t \ge 0, \tag{6} \]

where $\rho^i_t$, $i = 1,2$, are the solutions to (1) starting from the initial data $\rho^i_0 \in \mathscr{P}_2(\mathbb{R}^d)$.
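The contraction property (6) can be checked by hand in the simplest gradient case. The sketch below assumes $d = 1$, the linear drift $B(x) = \lambda x$ with $\lambda \neq 0$, and Gaussian initial data: the solution of (1) then stays Gaussian, with mean and variance solving $m' = -\lambda m$ and $v' = 2 - 2\lambda v$, and $W_2$ between Gaussians on the line has a closed form. These modelling choices and the function names are assumptions made for illustration only.

```python
import math


def ou_gaussian(mean0, var0, lam, t):
    """Mean and variance at time t of the Gaussian solution of (1) with B(x) = lam * x."""
    decay = math.exp(-lam * t)
    mean_t = mean0 * decay
    var_t = var0 * decay ** 2 + (1.0 - decay ** 2) / lam
    return mean_t, var_t


def w2_gaussian(mean1, var1, mean2, var2):
    """Closed-form W_2 between two Gaussians on the line."""
    return math.hypot(mean1 - mean2, math.sqrt(var1) - math.sqrt(var2))


def contraction_ratio(lam, t, start1, start2):
    """W_2 at time t divided by exp(-lam * t) * W_2 at time 0; (6) predicts <= 1."""
    w0 = w2_gaussian(*start1, *start2)
    wt = w2_gaussian(*ou_gaussian(*start1, lam, t), *ou_gaussian(*start2, lam, t))
    return wt / (math.exp(-lam * t) * w0)
```

Here `start1` and `start2` are `(mean, variance)` pairs with distinct entries, so that the initial distance is positive. When the two initial variances coincide the ratio equals 1, so the rate $e^{-\lambda t}$ cannot be improved for this drift.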



Two strategies for the derivation of the contraction estimate (6) in the gradient case. 

In order to prove (6) in the gradient case $B = \nabla V$, essentially two basic strategies have been
proposed:

1. A first approach, developed by [10] for smooth evolutions and by [2] in a measure-theoretic
setting, starts from equation (1) written in the form

\[ \partial_t \rho + \nabla\cdot(\rho\, v) = 0, \qquad v = -\Big(\frac{\nabla u}{u} + \nabla V\Big), \qquad \rho = u\,\mathscr{L}^d, \tag{7} \]

and it is based on two ingredients: the first one is the formula which evaluates the derivative
of the squared Wasserstein distance from a fixed measure $\sigma$ along the (absolutely continuous)
curve $\rho$ in $\mathscr{P}_2(\mathbb{R}^d)$

\[ \frac{d}{dt}\,\frac{1}{2} W_2^2(\rho_t,\sigma) = \int_{\mathbb{R}^d\times\mathbb{R}^d} \langle v_t(x), y - x \rangle\, d\boldsymbol{\rho}_t(x,y) \quad \text{for } \mathscr{L}^1\text{-a.e. } t > 0, \tag{8} \]

where $\boldsymbol{\rho}_t$ is an optimal coupling between $\rho_t$ and $\sigma$.

The second ingredient is the "subgradient" property of the vector field $v_t$ given by (7),
related to the displacement convexity of $\mathscr{H}$: in the case $\lambda = 0$ it reads as

\[ \int_{\mathbb{R}^d\times\mathbb{R}^d} \langle v_t(x), y - x \rangle\, d\boldsymbol{\rho}_t(x,y) \le \mathscr{H}(\sigma) - \mathscr{H}(\rho_t) \quad \text{if } v_t = -\Big(\frac{\nabla u_t}{u_t} + \nabla V\Big). \tag{9} \]

Combination of (8) and (9) yields the so-called Evolution Variational Inequality

\[ \frac{d}{dt}\,\frac{1}{2} W_2^2(\rho_t,\sigma) \le \mathscr{H}(\sigma) - \mathscr{H}(\rho_t) \quad \text{for every } \sigma \in \mathscr{P}_2(\mathbb{R}^d), \tag{10} \]

which easily yields (6) for $\lambda = 0$ by a variable-doubling argument (see [2, Theorem 11.1.4]).

The main technical point here is that (9) requires $v_t \in L^2(\rho_t)$ and (8) holds if for every
$0 < t_0 < t_1 < +\infty$

\[ \int_{t_0}^{t_1}\!\!\int_{\mathbb{R}^d} |v_t|^2\, d\rho_t\, dt = \int_{t_0}^{t_1}\!\!\int_{\mathbb{R}^d} \Big|\frac{\nabla u_t}{u_t} + \nabla V\Big|^2 d\rho_t\, dt < +\infty, \tag{11} \]

which should be imposed (in a suitable distributional sense) as an a priori regularity assumption
on the solution of (1). We do not know if solutions to (3) exhibit a similar regularization
effect. A second, even more difficult point prevents a simple extension of (10) to the general
non-gradient case: it is the lack of a potential $V$ and therefore of an entropy-like functional
$\mathscr{H}$ satisfying an inequality similar to (9).

2. A second approach has been proposed by [17] and further developed in [12, 9]: it is based
on the Benamou-Brenier [4] representation formula for the Wasserstein distance

\[ W_2^2(\rho_0,\rho_1) = \inf\Big\{ \int_0^1\!\!\int_{\mathbb{R}^d} |v_t|^2\, d\rho_t\, dt \;:\; \partial_t \rho_t + \nabla\cdot(\rho_t v_t) = 0 \text{ in } \mathbb{R}^d \times (0,1),\ \rho_0 = \rho|_{t=0},\ \rho_1 = \rho|_{t=1} \Big\} \tag{12} \]

and on a careful analysis of the effect of the evolution semigroup generated by the equation
on curves in $\mathscr{P}_2(\mathbb{R}^d)$ and its Riemannian tensor $\int_{\mathbb{R}^d} |v|^2\, d\rho$. This technique involves various
repeated differentiations and works quite well if a nice semigroup preserving smoothness and
strict positivity of the densities has already been defined. Once contraction has been proved
on smooth initial data, the evolution can be extended to more general ones, but it seems hard
to extend the uniqueness result to cover a general distributional solution to the equation.

Main result of the paper: contraction estimates for distributional solutions. Our
purpose is twofold:

• First of all we want to find a new approach working directly on measure-valued solutions to
(1) just satisfying the usual distributional formulation (3).

We note that in general (1) does not exhibit the same regularization effect of the heat
equation. Even in the gradient case $B = \nabla V$, there exist solutions $\rho_t$ to (3) which are
not of class $C^1(\mathbb{R}^d)$ for every $t > 0$: take, e.g., the invariant measure $\rho_t \equiv Z^{-1} e^{-V}$ for a
suitable convex function $V \notin C^1(\mathbb{R}^d)$ with $e^{-V} \in L^1(\mathbb{R}^d)$. Moreover, distributional solutions
are easily obtained by approximation arguments, such as regularization or splitting methods, and
they should be better suited to deal with the infinite dimensional case, as in [3]: a stability
result for such a weak class of solutions should be useful in these cases.

• Second, we want to cover the case of an arbitrary monotone field $B$, without any growth
restriction, and to extend contraction estimates to more general transportation costs.



To this aim, let us first introduce the general cost functional

\[ \mathcal{C}_h(\rho^1,\rho^2) := \inf\Big\{ \int_{\mathbb{R}^d\times\mathbb{R}^d} h(|x_1 - x_2|)\, d\boldsymbol{\rho}(x_1,x_2) \;:\; \boldsymbol{\rho} \in \mathscr{P}(\mathbb{R}^d\times\mathbb{R}^d),\ \boldsymbol{\rho} \text{ is a coupling between } \rho^1 \text{ and } \rho^2 \Big\}. \tag{13} \]

Throughout this paper we assume that

\[ h : [0,+\infty) \to [0,+\infty) \text{ is a continuous and non-decreasing function with } h(0) = 0. \]

Among the possible interesting choices of $h$, the case $h(r) := r^p$ is associated to the family
of $L^p$ Wasserstein distances (whose $L^2$-version has been introduced in (5)) on the space $\mathscr{P}_p(\mathbb{R}^d)$
of all the probability measures with finite moment of order $p$. When $h$ is a bounded concave function
satisfying $h(r) > 0$ if $r > 0$, $d(x,y) := h(|x - y|)$ is a bounded and complete distance function on
$\mathbb{R}^d$ inducing the usual Euclidean topology, so that $\mathcal{C}_h(\cdot,\cdot)$ is a complete metric on the space $\mathscr{P}(\mathbb{R}^d)$
whose topology coincides with the usual weak one (see, e.g., [2, Proposition 7.1.5]).

Since we are not assuming any homogeneity on the general cost function $h$, its rescaled versions

\[ h_s(r) := h(r e^s), \qquad s \in \mathbb{R},\ r \ge 0, \tag{14} \]

will be useful. Let us now state our main result:

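Theorem 1.1. If $\rho^1, \rho^2$ are two distributional solutions to (3) satisfying the summability condition

\[ \int_{t_0}^{t_1}\!\!\int_{\mathbb{R}^d} |B(x) - \lambda x|\, d\rho_t(x)\, dt < +\infty \quad \text{for every } 0 < t_0 < t_1 < +\infty, \tag{15} \]

then they satisfy

\[ \mathcal{C}_{h_{\lambda t}}(\rho^1_t, \rho^2_t) \le \mathcal{C}_h(\rho^1_0, \rho^2_0) \quad \text{for every } t \ge 0. \tag{16} \]

In particular, if $\rho^1_0 = \rho^2_0$ then $\rho^1$ and $\rho^2$ coincide for every time $t > 0$.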

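Let us make explicit some consequences of (16) according to the different signs of $\lambda$ and the
behaviour of $h$ near $0$ and $+\infty$:

Corollary 1.2. Let $\rho^1, \rho^2$ be two distributional solutions to (3) satisfying (15).

a) If $B$ is monotone, i.e. $\lambda \ge 0$, then

\[ \mathcal{C}_h(\rho^1_t, \rho^2_t) \le \mathcal{C}_h(\rho^1_0, \rho^2_0). \]

b) If $B$ is $\lambda$-monotone with $\lambda \ge 0$ and $h$ satisfies for some exponent $p > 0$

\[ h(a r) \ge a^p h(r) \quad \text{for every } a \ge 1 \text{ and } r \ge 0, \tag{17} \]

then

\[ \mathcal{C}_h(\rho^1_t, \rho^2_t) \le e^{-p\lambda t}\, \mathcal{C}_h(\rho^1_0, \rho^2_0). \]

c) If $B$ is $\lambda$-monotone with $\lambda < 0$ and $h$ satisfies for some exponent $p > 0$

\[ h(a r) \ge a^p h(r) \quad \text{for every } a \le 1 \text{ and } r \ge 0, \]

then

\[ \mathcal{C}_h(\rho^1_t, \rho^2_t) \le e^{-p\lambda t}\, \mathcal{C}_h(\rho^1_0, \rho^2_0). \]

In the particular case of the Wasserstein distance $W_p$, $p \ge 1$, we have

\[ W_p(\rho^1_t, \rho^2_t) \le e^{-\lambda t}\, W_p(\rho^1_0, \rho^2_0). \tag{18} \]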
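The growth condition (17) separates costs that give an exponential rate from costs that only give non-expansion. The sketch below tests (17) on a finite grid of values for two sample costs: the power cost $h(r) = r^2$, which satisfies it with $p = 2$, and a bounded concave cost, which cannot satisfy it for any $p > 0$ because $a^p h(r)$ is unbounded in $a$. The helper name, the sample grid and the two costs are assumptions made for the example, and a positive answer on a grid is a sanity check rather than a proof.

```python
def satisfies_17(h, p, scales=(1.0, 1.5, 2.0, 10.0, 100.0),
                 radii=(0.0, 0.1, 1.0, 5.0, 50.0)):
    """Check h(a r) >= a**p * h(r) on a finite grid of a >= 1 and r >= 0.

    A True result only concerns the sampled values; it is not a proof of (17).
    """
    slack = 1e-12  # relative slack absorbing floating-point rounding
    return all(h(a * r) >= (a ** p) * h(r) * (1.0 - slack)
               for a in scales for r in radii)


def power_cost(r):
    """h(r) = r^2, the cost behind W_2."""
    return r ** 2


def bounded_cost(r):
    """A bounded, concave, increasing cost with h(0) = 0."""
    return r / (1.0 + r)
```

With these definitions `satisfies_17(power_cost, 2.0)` holds while `satisfies_17(bounded_cost, 0.5)` fails, so for the bounded cost only part a) of the corollary applies.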

Theorem 1.1 has a simple application to invariant measures $\rho_\infty \in \mathscr{P}(\mathbb{R}^d)$, which are stationary
solutions of (3) and therefore satisfy

\[ \int_{\mathbb{R}^d} (\Delta \zeta - B \cdot \nabla \zeta)\, d\rho_\infty = 0 \quad \forall\, \zeta \in C_c^\infty(\mathbb{R}^d). \tag{19} \]

Corollary 1.3 (Strongly monotone operators and invariant measures). Let us suppose that $B$
is strongly monotone, i.e. $\lambda > 0$. Then equation (19) has at most one solution $\rho_\infty \in \mathscr{P}(\mathbb{R}^d)$
satisfying the integrability condition

\[ \int_{\mathbb{R}^d} |B(x) - \lambda x|\, d\rho_\infty(x) < \infty. \tag{20} \]

For each solution $\rho_t$ to (3)-(15) and each cost $h$ satisfying (17) we have

\[ \mathcal{C}_h(\rho_t, \rho_\infty) \le e^{-p\lambda(t - t_0)}\, \mathcal{C}_h(\rho_{t_0}, \rho_\infty). \tag{21} \]

Note that in the case $\lambda > 0$ condition (20) is weaker than $B \in L^1(\rho_\infty; \mathbb{R}^d)$.

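Remark 1.4 (An equivalent formulation of the contraction estimate). We can give an equivalent
version of (16) by keeping fixed the cost but rescaling the measures. In fact, we can associate to
the solutions $\rho^1, \rho^2$ of (3) their rescaled versions $\tilde\rho^1, \tilde\rho^2$ defined by

\[ \tilde\rho^j_t(E) := \rho^j_t(e^{-\lambda t} E) \quad \text{for every Borel set } E \subset \mathbb{R}^d,\ j = 1, 2. \tag{22} \]

Then $\tilde\rho^j_t$ is the push-forward of $\rho^j_t$ through the map $x \mapsto e^{\lambda t} x$ and satisfies the change-of-variables
formula

\[ \int_{\mathbb{R}^d} \zeta(y)\, d\tilde\rho^j_t(y) = \int_{\mathbb{R}^d} \zeta(e^{\lambda t} x)\, d\rho^j_t(x) \quad \text{for every } \zeta \in C_b(\mathbb{R}^d). \tag{23} \]

Inequality (16) is then equivalent to

\[ \mathcal{C}_h(\tilde\rho^1_t, \tilde\rho^2_t) \le \mathcal{C}_h(\rho^1_0, \rho^2_0) \quad \text{for every } t \ge 0. \tag{24} \]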

Strategy of the proof: Kantorovich duality and a variable-doubling technique. In
order to prove Theorem 1.1 we develop a new strategy, generalizing [18]. It relies on the well-known
dual Kantorovich formulation [21] of the transportation cost (13):

\[ \mathcal{C}_h(\rho^1,\rho^2) = \sup\Big\{ \int_{\mathbb{R}^d} \phi^1\, d\rho^1 + \int_{\mathbb{R}^d} \phi^2\, d\rho^2 \;:\; \phi^1, \phi^2 \in C_b(\mathbb{R}^d),\ \phi^1(x_1) + \phi^2(x_2) \le h(|x_1 - x_2|) \Big\}. \tag{25} \]

This formula reduces the estimate of the cost $\mathcal{C}_h(\rho^1_T, \rho^2_T)$ of two solutions of (1) at a certain final
time $T$ to the estimate of

\[ \Sigma(\phi^1, \phi^2; T) := \int_{\mathbb{R}^d} \phi^1\, d\rho^1_T + \int_{\mathbb{R}^d} \phi^2\, d\rho^2_T \tag{26} \]

for an arbitrary pair of functions $\phi^1, \phi^2$ satisfying the constraint

\[ \phi^1(x_1) + \phi^2(x_2) \le h(|x_1 - x_2|) \quad \text{for every } x_1, x_2 \in \mathbb{R}^d. \tag{27} \]

Assuming for the sake of simplicity that $B$ is monotone, bounded and smooth, we can obtain an
estimate of $\Sigma(\phi^1, \phi^2; T)$ by solving the final-value problem for the adjoint equation

\[ \partial_t \phi^i + \Delta \phi^i - B \cdot \nabla \phi^i = 0 \ \text{ in } \mathbb{R}^d \times (0,T), \qquad \phi^i(\cdot, T) := \phi^i, \tag{28} \]

since the distributional formulation (3) yields

\[ \Sigma(\phi^1, \phi^2; T) = \Sigma(\phi^1_0, \phi^2_0; 0), \tag{29} \]

where $\phi^i_0 := \phi^i(\cdot, 0)$ and $\Sigma(\cdot,\cdot\,; 0)$ is defined as in (26) with $\rho^i_0$ in place of $\rho^i_T$.
The following crucial result, based on a "variable-doubling technique", provides the final step,
showing that $\phi^1_0, \phi^2_0$ still satisfy the constraint (27), so that $\Sigma(\phi^1_0, \phi^2_0; 0) \le \mathcal{C}_h(\rho^1_0, \rho^2_0)$.

Theorem 1.5. If $\phi^1, \phi^2 \in C^{2,1}_b(\mathbb{R}^d \times [0,T])$ are solutions of (28) in the case when $B$ is monotone,
bounded and smooth, such that

\[ \phi^1(x_1, T) + \phi^2(x_2, T) \le h(|x_1 - x_2|) \quad \forall\, x_1, x_2 \in \mathbb{R}^d, \]

then

\[ \phi^1(x_1, 0) + \phi^2(x_2, 0) \le h(|x_1 - x_2|) \quad \forall\, x_1, x_2 \in \mathbb{R}^d. \]

Remark 1.6. While we prove Theorem 1.5 for bounded and smooth drifts $B$, and solutions
$\phi^{1,2} \in C^{2,1}_b(\mathbb{R}^d \times [0,T])$, the property clearly carries over to any pointwise limit of such solutions.
We therefore expect it to hold for a much larger class of monotone drifts $B$ and solutions.



Plan of the paper. In Section 2, we collect some tools useful to our arguments: we present a
slightly refined version of Kantorovich duality, an approximation technique of the cost functional,
the construction of a smooth and bounded approximation of the operator $B$, and a rescaling trick
which allows us to consider $\lambda = 0$ in the following arguments. Section 3 is devoted to the proof of
Theorem 1.5, while the last section contains the proof of Theorem 1.1.



2 Preliminaries 

In this section we collect some preliminary and technical regularization results which will turn out to
be useful in the sequel.

2.1 $C_c^\infty(\mathbb{R}^d)$ functions in Kantorovich duality

Let us first show that we can assume that $\phi^1, \phi^2$ are smooth and compactly supported in the duality
formula (25).

Proposition 2.1. If the cost function $h$ is Lipschitz continuous and satisfies $\lim_{r \uparrow +\infty} h(r) = +\infty$,
then

\[ \mathcal{C}_h(\rho^1,\rho^2) = \sup\Big\{ \int_{\mathbb{R}^d} \phi^1\, d\rho^1 + \int_{\mathbb{R}^d} \phi^2\, d\rho^2 \;:\; \phi^1, \phi^2 \in C_c^\infty(\mathbb{R}^d),\ \phi^1(x_1) + \phi^2(x_2) \le h(|x_1 - x_2|) \Big\}. \tag{30} \]

Proof. Let us recall that the $h$-transform of a given bounded function $\zeta : \mathbb{R}^d \to \mathbb{R}$ is defined as

\[ \zeta^h(x) := \inf_{y \in \mathbb{R}^d} h(|x - y|) - \zeta(y), \tag{31} \]

and it is a bounded and Lipschitz continuous function satisfying $\zeta(x) + \zeta^h(y) \le h(|x - y|)$.
Let us fix $c < \mathcal{C}_h(\rho^1,\rho^2)$ and admissible $\phi^1, \phi^2 \in C_b(\mathbb{R}^d)$ such that

\[ \int_{\mathbb{R}^d} \phi^1\, d\rho^1 + \int_{\mathbb{R}^d} \phi^2\, d\rho^2 > c. \tag{32} \]

By possibly replacing $\phi^2$ with $(\phi^1)^h \ge \phi^2$ and $\phi^1$ with $(\phi^1)^{hh} \ge \phi^1$, it is not restrictive to
assume that $\phi^1, \phi^2$ are also Lipschitz continuous. Adding to $\phi^1$ and subtracting from $\phi^2$ a suitable
constant, we can also assume that $\phi^1 \ge 0$ and $\phi^2 \le 0$.

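Let us now consider a family of mollifiers $\kappa_\eta$ and of cutoff functions $\chi_R$ defined by

\[ \kappa_\eta(x) := \eta^{-d} \kappa(x/\eta), \qquad \chi_R(x) := \chi(x/R), \qquad x \in \mathbb{R}^d,\ \eta, R > 0, \tag{33a} \]

where $\kappa, \chi \in C_c^\infty(\mathbb{R}^d)$ satisfy

\[ \kappa \ge 0, \quad \int_{\mathbb{R}^d} \kappa(x)\, dx = 1, \quad 0 \le \chi \le 1, \quad \chi(x) = 0 \text{ if } |x| \ge 1, \quad \chi(x) = 1 \text{ if } |x| \le 1/2. \tag{33b} \]

We set $\phi^1_\eta := \phi^1 * \kappa_\eta$ and $\phi^2_\eta := \phi^2 * \kappa_\eta - \delta_\eta$, where

\[ \delta_\eta := \sup\,(\phi^1 * \kappa_\eta - \phi^1)_+ + \sup\,(\phi^2 * \kappa_\eta - \phi^2)_+ . \]

The definition of $\delta_\eta$ yields

\[ \phi^1_\eta(x_1) + \phi^2_\eta(x_2) \le \phi^1 * \kappa_\eta(x_1) - \phi^1(x_1) + \phi^2 * \kappa_\eta(x_2) - \phi^2(x_2) - \delta_\eta + h(|x_1 - x_2|) \le h(|x_1 - x_2|). \]

Moreover, since $\phi^1, \phi^2$ are Lipschitz, $\phi^1_\eta$ and $\phi^2_\eta$ converge to $\phi^1, \phi^2$ uniformly as $\eta \downarrow 0$, so that,
for $\eta$ sufficiently small, $\phi^1_\eta$ and $\phi^2_\eta$ are a smooth admissible pair still satisfying (32) and the
sign condition $\phi^1_\eta \ge 0$, $\phi^2_\eta \le 0$.
Let us now choose $R_0 > 0$ such that

\[ h(r) \ge \sup_{\mathbb{R}^d} \phi^1_\eta \quad \text{for every } r \ge R_0. \tag{34} \]

Setting $\phi^1_{\eta,R} := \phi^1_\eta\, \chi_R \le \phi^1_\eta$ we easily have for $R > R_0$

\[ \inf_{x_1 \in \mathbb{R}^d} h(|x_1 - x_2|) - \phi^1_{\eta,R}(x_1) \ge 0 \quad \text{if } |x_2| \ge 2R > R + R_0. \]

Since $\phi^2_{\eta,R} := \phi^2_\eta\, \chi_{4R}$ satisfies $\phi^2_{\eta,R}(x_2) = \phi^2_\eta(x_2)$ if $|x_2| \le 2R$ and $\phi^2_{\eta,R}(x_2) \le 0$ for every
$x_2 \in \mathbb{R}^d$, it follows that $\phi^1_{\eta,R}, \phi^2_{\eta,R}$ is an admissible couple in $C_c^\infty(\mathbb{R}^d)$, and, for $R$ sufficiently large, it
still satisfies (32). □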
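The first step of the proof, replacing an admissible pair by $h$-transforms, can be reproduced on a grid. The following sketch computes a discrete analogue of (31), with the infimum restricted to grid points, and checks the two facts used above: the pair $(\zeta, \zeta^h)$ is admissible, and the double transform dominates $\zeta$. The grid, the sample function `ZETA` and the linear cost are assumptions chosen for illustration.

```python
import math


def h_transform(values, grid, h):
    """Discrete analogue of (31): min over grid points y of h(|x - y|) - zeta(y)."""
    return [min(h(abs(x - y)) - z for y, z in zip(grid, values)) for x in grid]


def is_admissible(phi1, phi2, grid, h, tol=1e-12):
    """Check phi1(x1) + phi2(x2) <= h(|x1 - x2|) for all pairs of grid points."""
    return all(
        a + b <= h(abs(x1 - x2)) + tol
        for x1, a in zip(grid, phi1)
        for x2, b in zip(grid, phi2)
    )


def linear_cost(r):
    """Assumed example of a Lipschitz, unbounded cost: h(r) = r."""
    return r


GRID = [-3.0 + 0.1 * i for i in range(61)]
ZETA = [math.cos(3.0 * x) for x in GRID]           # an arbitrary bounded function
ZETA_H = h_transform(ZETA, GRID, linear_cost)      # zeta^h
ZETA_HH = h_transform(ZETA_H, GRID, linear_cost)   # (zeta^h)^h
```

On the grid, `(ZETA_HH, ZETA_H)` is again admissible and `ZETA_HH` lies above `ZETA`, so passing to transforms can only increase the dual functional, which is why the replacement in the proof is not restrictive.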

2.2 Regularization of the cost function. 

In this section we shall show that it is sufficient to consider nonnegative, Lipschitz, and unbounded 
costs (as those considered in Proposition 2.1) in the proof of Theorem 1.1. 

Lemma 2.2. If (16) holds for every nonnegative Lipschitz and nondecreasing cost function $h$ with
$\lim_{r \uparrow +\infty} h(r) = +\infty$, then it holds for every continuous and nondecreasing cost $h$.

Proof. We first prove that it is sufficient to consider nonnegative Lipschitz costs; in a second step,
we deal with the asymptotic requirement.

Step 1: $h$ Lipschitz. Adding a suitable constant we can assume that $h(r) \ge h(0) = 0$. We can
then approximate $h$ from below by the increasing sequence of nonnegative Lipschitz functions

\[ h_n(r) := \inf_{s \ge 0}\, h(s) + n|r - s|, \]

which satisfies

\[ 0 = h_n(0) \le h_n(r) \le h(r), \qquad \lim_{n \uparrow +\infty} h_n(r) = h(r) \quad \forall\, r \ge 0, \]

the convergence being uniform on each compact interval of $[0,+\infty)$. Applying Lemma 2.3 below
we find

\[ \mathcal{C}_{h_{\lambda t}}(\rho^1_t, \rho^2_t) \overset{(37)}{=} \lim_{n \uparrow +\infty} \mathcal{C}_{(h_n)_{\lambda t}}(\rho^1_t, \rho^2_t) \overset{(16)}{\le} \liminf_{n \uparrow +\infty} \mathcal{C}_{h_n}(\rho^1_0, \rho^2_0) \overset{(37)}{=} \mathcal{C}_h(\rho^1_0, \rho^2_0). \]

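Step 2: $\lim_{r \uparrow +\infty} h(r) = +\infty$. Let us set $\rho_0 := \rho^1_0 + \rho^2_0$; let us introduce the function

\[ m(r) := \rho_0(\mathbb{R}^d \setminus rU), \qquad U := \{x \in \mathbb{R}^d : |x| < 1\}, \]

and let us consider a sequence $r_n$ in $[0,+\infty)$ such that

\[ r_0 := 0, \quad r_1 := 1, \quad r_{n+1} - r_n \ge r_n - r_{n-1} \quad \text{and} \quad m(r_{n+1}) \le 2^{-n}. \]

It is easy to check that $r_n$ is a diverging increasing sequence; if $g$ is the piecewise linear function
satisfying $g(r_n) = n$, i.e.

\[ g(r) := n + \frac{r - r_n}{r_{n+1} - r_n} \quad \text{if } r \in [r_n, r_{n+1}], \]

then $g$ is Lipschitz continuous, increasing, unbounded, concave, and it satisfies $g(0) = 0$ and

\[ G := \int_{\mathbb{R}^d} g(|x|)\, d\rho_0(x) = \int_{\mathbb{R}^d} \Big( \int_0^{|x|} g'(r)\, dr \Big) d\rho_0(x) = \int_{\mathbb{R}^d} \Big( \int_0^{+\infty} g'(r)\, \mathbb{1}_{r < |x|}\, dr \Big) d\rho_0(x) \]
\[ \phantom{G :} = \int_0^{+\infty} g'(r)\, m(r)\, dr = \sum_{n=1}^{+\infty} \frac{1}{r_n - r_{n-1}} \int_{r_{n-1}}^{r_n} m(r)\, dr \le \sum_{n=0}^{+\infty} m(r_n) < +\infty. \]

We can thus consider the perturbed cost

\[ h_\varepsilon(r) := h(r) + \varepsilon\, g(r), \]

which is Lipschitz, increasing, unbounded. Since $g$ is concave, increasing, and $g(0) = 0$, we have

\[ g(|x_1 - x_2|) \le g(|x_1| + |x_2|) \le g(|x_1|) + g(|x_2|) \quad \text{for every } x_1, x_2 \in \mathbb{R}^d, \tag{35} \]

so that if $\boldsymbol{\rho}_0$ is an optimal coupling between $\rho^1_0$ and $\rho^2_0$ for the cost $h$ (we can assume that the
initial cost is finite), then

\[ \mathcal{C}_h(\rho^1_0,\rho^2_0) \le \mathcal{C}_{h_\varepsilon}(\rho^1_0,\rho^2_0) \le \mathcal{C}_h(\rho^1_0,\rho^2_0) + \varepsilon \int_{\mathbb{R}^d\times\mathbb{R}^d} g(|x_1 - x_2|)\, d\boldsymbol{\rho}_0(x_1,x_2) \]
\[ \overset{(35)}{\le} \mathcal{C}_h(\rho^1_0,\rho^2_0) + \varepsilon \int_{\mathbb{R}^d\times\mathbb{R}^d} \big( g(|x_1|) + g(|x_2|) \big)\, d\boldsymbol{\rho}_0(x_1,x_2) = \mathcal{C}_h(\rho^1_0,\rho^2_0) + \varepsilon\, G. \]

Therefore, if Theorem 1.1 holds for $h_\varepsilon$ we have

\[ \mathcal{C}_h(\rho^1_t,\rho^2_t) \le \mathcal{C}_{h_\varepsilon}(\rho^1_t,\rho^2_t) \le \mathcal{C}_{h_\varepsilon}(\rho^1_0,\rho^2_0) \le \mathcal{C}_h(\rho^1_0,\rho^2_0) + \varepsilon\, G. \]

Passing to the limit as $\varepsilon \downarrow 0$ we conclude. □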
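The Lipschitz regularization of Step 1 is easy to visualize numerically. The sketch below evaluates a grid version of $h_n(r) = \inf_{s \ge 0} h(s) + n|r - s|$, with the infimum restricted to grid points, for the sample cost $h(r) = \sqrt{r}$, which is continuous and nondecreasing but not Lipschitz at $0$. The grid, the sample cost and the helper names are assumptions made for the example.

```python
import math


def lipschitz_regularization(h, n, grid):
    """Grid version of h_n(r) = inf_{s >= 0} h(s) + n |r - s|, with s restricted to the grid."""
    return [min(h(s) + n * abs(r - s) for s in grid) for r in grid]


STEP = 0.01
GRID = [STEP * k for k in range(201)]       # sample points of [0, 2]
H_VALUES = [math.sqrt(r) for r in GRID]     # h(r) = sqrt(r), not Lipschitz at 0
H_1 = lipschitz_regularization(math.sqrt, 1, GRID)
H_4 = lipschitz_regularization(math.sqrt, 4, GRID)
H_16 = lipschitz_regularization(math.sqrt, 16, GRID)


def max_slope(values, step=STEP):
    """Largest difference quotient between neighbouring grid points."""
    return max(abs(b - a) for a, b in zip(values, values[1:])) / step
```

On the grid one observes the three properties used in the proof: the values increase with $n$, they stay below $h$, and the slope of the $n$-th approximation is bounded by $n$, while the original cost has slope $10$ on the first grid cell.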

The following result provides a variant of well known stability properties of transportation
costs (see [19, Theorem 3], [22, Theorem 5.20]) and holds in the much more general setting of optimal
transportation in Radon metric spaces [2, Chapter 6].

Lemma 2.3 (Lower semicontinuity of the cost functional w.r.t. local uniform convergence of $h$).
Let $h : [0,+\infty) \to [0,+\infty)$ be a continuous cost function and let $h_n : [0,+\infty) \to [0,+\infty)$ be
a sequence of lower semicontinuous functions converging to $h$ locally uniformly in $[0,+\infty)$. For
every couple $\rho^1, \rho^2 \in \mathscr{P}(\mathbb{R}^d)$ we have

\[ \liminf_{n \uparrow +\infty} \mathcal{C}_{h_n}(\rho^1,\rho^2) \ge \mathcal{C}_h(\rho^1,\rho^2). \tag{36} \]

In particular, if $h_n \le h$ for every $n \in \mathbb{N}$ then

\[ \lim_{n \to +\infty} \mathcal{C}_{h_n}(\rho^1,\rho^2) = \mathcal{C}_h(\rho^1,\rho^2). \tag{37} \]

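Proof. Let us set $H_n(x_1,x_2) := h_n(|x_1 - x_2|)$ and observe that $H_n$ converges to $H(x_1,x_2) :=
h(|x_1 - x_2|)$ uniformly on compact sets of $\mathbb{R}^d \times \mathbb{R}^d$. If $\boldsymbol{\rho}_n \in \mathscr{P}(\mathbb{R}^d \times \mathbb{R}^d)$ is an optimal coupling
between $\rho^1, \rho^2$ with respect to the cost $h_n$ then

\[ \mathcal{C}_{h_n}(\rho^1,\rho^2) = \int_{[0,+\infty)} z\, d\varrho_n(z), \qquad \text{where } \varrho_n := (H_n)_\# \boldsymbol{\rho}_n. \]

Since the marginals of $\boldsymbol{\rho}_n$ are fixed, the sequence $(\boldsymbol{\rho}_n)_{n \in \mathbb{N}}$ is tight and up to the extraction
of a suitable subsequence (still denoted by $\boldsymbol{\rho}_n$) we can suppose that $\boldsymbol{\rho}_n$ converge to some
limit coupling $\boldsymbol{\rho}$ between $\rho^1, \rho^2$ in $\mathscr{P}(\mathbb{R}^d \times \mathbb{R}^d)$. Since $\varrho_n$ weakly converge to $\varrho := H_\# \boldsymbol{\rho}$ by [2,
Lemma 5.2.1], standard lower semicontinuity of integrals with nonnegative continuous integrands
[2, Lemma 5.1.7] yields

\[ \liminf_{n \to +\infty} \int_{[0,+\infty)} z\, d\varrho_n(z) \ge \int_{[0,+\infty)} z\, d\varrho(z) = \int_{\mathbb{R}^d\times\mathbb{R}^d} H(x_1,x_2)\, d\boldsymbol{\rho}(x_1,x_2) \ge \mathcal{C}_h(\rho^1,\rho^2). \]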



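□

2.3 Bounded, smooth approximations of a monotone operator

If $A : \mathbb{R}^d \to \mathbb{R}^d$ is a monotone operator then there exists [8, Corollary 2.1] a maximal monotone
multivalued extension $\boldsymbol{A} : \mathbb{R}^d \rightrightarrows \mathbb{R}^d$ (thus taking values in $2^{\mathbb{R}^d}$) such that $A(x) \in \boldsymbol{A}(x)$ for every
$x \in \mathbb{R}^d$. We denote by $\boldsymbol{A}^\circ(x)$ the element of minimal norm in (the closed convex set) $\boldsymbol{A}(x)$.
[1, Corollary 1.4] shows that the set $\boldsymbol{A}(x) \subset \mathbb{R}^d$ reduces to the singleton $\{A(x)\}$ $\mathscr{L}^d$-almost
everywhere: in fact it satisfies

\[ \boldsymbol{A}(x) = \{\boldsymbol{A}^\circ(x)\} = \{A(x)\} \ \text{ for } \mathscr{L}^d\text{-a.e. } x \in \mathbb{R}^d, \qquad \boldsymbol{A}(x) = \mathrm{conv}\Big\{ \lim_{n \to \infty} A(x_n) \ \text{for some } x_n \to x \Big\}. \tag{38} \]

We recall the following important approximation result [13, Theorem 4.1]: we denote by $U$ the
open unit ball in $\mathbb{R}^d$.

Theorem (Fitzpatrick-Phelps). For every maximal monotone operator $\boldsymbol{A} : \mathbb{R}^d \rightrightarrows \mathbb{R}^d$, there exists
a sequence of maximal monotone operators $\boldsymbol{A}_n : \mathbb{R}^d \rightrightarrows \mathbb{R}^d$ such that, for each $x \in \mathbb{R}^d$ and all $n$,

\[ \boldsymbol{A}(x) \cap nU \subset \boldsymbol{A}_n(x) \subset n\overline{U}, \qquad \boldsymbol{A}_n(x) \setminus \boldsymbol{A}(x) \subset n\,\partial U. \tag{39} \]

Notice that (39) yields in particular

\[ |\boldsymbol{A}_n^\circ(x)| = \min\big(|\boldsymbol{A}^\circ(x)|, n\big) \quad \text{for every } x \in \mathbb{R}^d. \tag{40} \]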

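Theorem 2.4. Let $\boldsymbol{A} : \mathbb{R}^d \rightrightarrows \mathbb{R}^d$ be a maximal monotone operator and $(\beta_n)_{n \in \mathbb{N}}$ a vanishing
sequence of positive real numbers. There exists a sequence of smooth, globally Lipschitz, and
bounded monotone operators $A_n : \mathbb{R}^d \to \mathbb{R}^d$ such that

\[ \mathrm{Lip}(A_n) \le n, \qquad |A_n(x)| \le \min\big(|\boldsymbol{A}^\circ(x)|, n\big) + \beta_n, \qquad \lim_{n \to +\infty} A_n(x) = \boldsymbol{A}^\circ(x) \quad \text{for every } x \in \mathbb{R}^d. \tag{41} \]

Proof. Let $\boldsymbol{A}_n$ be a sequence of maximal monotone operators satisfying (39) and let $Y_n : \mathbb{R}^d \to \mathbb{R}^d$
be the Moreau-Yosida approximation of $\boldsymbol{A}_n$ of parameter $n^{-1}$ [8, Proposition 2.6]

\[ Y_n(x) := n\big( x - (I + n^{-1} \boldsymbol{A}_n)^{-1} x \big). \]

Note that $Y_n$ is an $n$-Lipschitz monotone map satisfying

\[ |Y_n(x)| \le |\boldsymbol{A}_n^\circ(x)| \overset{(40)}{=} \min\big(|\boldsymbol{A}^\circ(x)|, n\big) \quad \text{for every } x \in \mathbb{R}^d. \tag{42} \]

Let us fix $x \in \mathbb{R}^d$ and let $x_n \in \mathbb{R}^d$ be the unique solution of

\[ x_n + n^{-1} \boldsymbol{A}_n(x_n) \ni x, \quad \text{so that} \quad Y_n(x) = n(x - x_n) \in \boldsymbol{A}_n(x_n). \tag{43} \]

If $n > |\boldsymbol{A}^\circ(x)|$ then (42) yields $Y_n(x) \notin n\,\partial U$; applying (39) and (42) again we get

\[ Y_n(x) \in \boldsymbol{A}(x_n), \qquad |Y_n(x)| \le |\boldsymbol{A}^\circ(x)|, \qquad |x - x_n| \le n^{-1} |\boldsymbol{A}^\circ(x)| \quad \text{for every } n > |\boldsymbol{A}^\circ(x)|. \tag{44} \]

Since the graph of $\boldsymbol{A}$ is closed, any accumulation point $y$ of the bounded sequence $Y_n(x)$ satisfies

\[ y \in \boldsymbol{A}(x), \qquad |y| \le |\boldsymbol{A}^\circ(x)|. \tag{45} \]

We thus conclude that $\lim_{n \uparrow +\infty} Y_n(x) = \boldsymbol{A}^\circ(x)$ for every $x \in \mathbb{R}^d$.

To conclude the proof we need to regularize $Y_n$: to this aim we consider the family of mollifiers
$\kappa_\eta$ as in (33a) and we set

\[ A_n := Y_n * \kappa_\eta \quad \text{with } \eta := (nk)^{-1} \beta_n, \quad \text{where } k := \int_{\mathbb{R}^d} |x|\, \kappa(x)\, dx, \tag{46} \]

so that

\[ |A_n(x) - Y_n(x)| \le \eta\, k\, \mathrm{Lip}(Y_n) \le n \eta k \le \beta_n. \]

□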
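The Moreau-Yosida step of the proof can be made explicit in a one-dimensional model case. The sketch below takes for the maximal monotone operator the subdifferential of $|x|$ on the real line, whose resolvent is the soft-thresholding map; this operator is already bounded, so the Fitzpatrick-Phelps truncation is not modelled here. The choice of operator and the function names are assumptions made for illustration.

```python
import math


def resolvent_abs(x, n):
    """(I + A / n)^{-1} x for A = subdifferential of |.|, i.e. soft-thresholding at 1/n."""
    return math.copysign(max(abs(x) - 1.0 / n, 0.0), x)


def yosida_abs(x, n):
    """Y_n(x) = n * (x - (I + A / n)^{-1} x); here it equals n * x clipped to [-1, 1]."""
    return n * (x - resolvent_abs(x, n))


def minimal_section_abs(x):
    """Element of minimal norm of A(x): sign(x), with value 0 at x = 0."""
    return 0.0 if x == 0 else math.copysign(1.0, x)
```

In this example `yosida_abs(x, n)` is nondecreasing and $n$-Lipschitz in `x`, it is bounded in modulus by `minimal_section_abs(x)`, and it converges to `minimal_section_abs(x)` as `n` grows, which mirrors (42) and the pointwise limit obtained from (45).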
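We consider now a radial smoothing:

Proposition 2.5. Let $A_n : \mathbb{R}^d \to \mathbb{R}^d$ be smooth, Lipschitz, and bounded monotone operators
satisfying (41). For every $m \in \mathbb{N}$ there exist bounded, smooth, Lipschitz, and monotone operators
$A_{n,m}$ such that

\[ \mathrm{Lip}(A_{n,m}) \le n, \qquad \sup_{x \in \mathbb{R}^d} |A_{n,m}(x)| \le n + \beta_n, \qquad \sup_{x \in \mathbb{R}^d} |DA_{n,m}(x) \cdot x| \le 2m\,(n + \beta_n), \tag{47} \]

\[ \lim_{m \uparrow +\infty} A_{n,m}(x) = A_n(x) \quad \text{for every } x \in \mathbb{R}^d. \tag{48} \]

Proof. We consider a family of mollifiers $\kappa_\eta = \eta^{-1} \kappa(\cdot/\eta) \in C_c^\infty(\mathbb{R})$, where $\kappa$ satisfies

\[ \mathrm{supp}(\kappa) \subset [0,2], \qquad 0 \le \kappa \le \kappa(1) = 1, \qquad (1 - x)\,\kappa'(x) \ge 0, \qquad \int_{\mathbb{R}} \kappa(x)\, dx = 1, \tag{49} \]

and the function $\vartheta \in C_c^\infty(0,+\infty)$ defined by $\vartheta(r) := \kappa(-\log r)$, $r > 0$. We set

\[ A_{n,m}(x) := m \int_0^{+\infty} A_n(r x)\, \vartheta(r^m)\, \frac{dr}{r}. \tag{50} \]

The change of variable $r = e^{-z}$ shows that

\[ A_{n,m}(x) = m \int_{\mathbb{R}} A_n(x e^{-z})\, \kappa(m z)\, dz = A^x_n * \kappa_{1/m}(0), \qquad \text{where } A^x_n(z) := A_n(x e^{z}) \text{ for } z \in \mathbb{R}. \]

It is then easy to check that $|DA_{n,m}| \le n$ since

\[ |DA_{n,m}(x)| \le m \int_{\mathbb{R}} |DA_n(x e^{-z})|\, e^{-z}\, \kappa(m z)\, dz \overset{(41)}{\le} n \int_{\mathbb{R}} e^{-y/m}\, \kappa(y)\, dy \overset{(49)}{\le} n, \]

and $A_{n,m}$ converges pointwise to $A_n$ as $m \uparrow +\infty$.
Concerning the second bound of (47) we easily have

\[ DA_{n,m}(x) \cdot x = m \int_0^{+\infty} DA_n(r x) \cdot x\; \vartheta(r^m)\, dr = m \int_0^{+\infty} \frac{d}{dr}\big(A_n(r x)\big)\, \vartheta(r^m)\, dr = -m^2 \int_0^{+\infty} A_n(r x)\, \tilde\vartheta(r^m)\, \frac{dr}{r}, \]

where $\tilde\vartheta(r) := r\,\vartheta'(r)$, so that the inequality follows from the bound on $A_n$ in (41) and from (49):
since $\kappa$ is nondecreasing in $[0,1]$ and nonincreasing in $[1,2]$ with $\kappa(1) = 1$, we have
$\int_0^{+\infty} |\vartheta'(r)|\, dr = 2$. □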

2.4 $\lambda$-monotonicity and rescaling

We show here a simple rescaling argument (inspired by [11], where the rescaling technique has
been applied to a wide class of diffusion equations), which is useful to reduce the estimates in the
general $\lambda$-monotone case to the simpler case of a monotone operator.

We therefore assume that $\lambda \neq 0$, and we introduce the time rescaling functions

\[ s(t) := \int_0^t e^{2\lambda r}\, dr = \frac{1}{2\lambda}\big(e^{2\lambda t} - 1\big), \qquad t(s) := \frac{1}{2\lambda} \log(1 + 2\lambda s), \quad s \in [0, S_\infty), \tag{51} \]

where

\[ S_\infty := \begin{cases} +\infty & \text{if } \lambda > 0, \\ -1/(2\lambda) & \text{if } \lambda < 0. \end{cases} \tag{52} \]

We associate to a family of probability measures $\rho_t$, $t \in [0,T]$, their rescaled versions $\sigma_s$,
$s \in [0, S_\infty)$, defined by

\[ \sigma_s(E) := \rho_{t(s)}\big(e^{-\lambda t(s)} E\big) \quad \text{for every Borel set } E \subset \mathbb{R}^d. \tag{53} \]

If $B : \mathbb{R}^d \to \mathbb{R}^d$ is a $\lambda$-monotone Borel map we set $A := B - \lambda I$ and

\[ \tilde B(y, s) := e^{-\lambda t(s)} B\big(e^{-\lambda t(s)} y\big), \qquad \tilde A(y, s) := e^{-\lambda t(s)} A\big(e^{-\lambda t(s)} y\big) \quad \text{for } y \in \mathbb{R}^d,\ s \in [0, S_\infty). \tag{54} \]

Notice that if $B$ is $\lambda$-monotone, then $A$ and $\tilde A(\cdot, s)$, $s \in [0, S_\infty)$, are monotone.
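The time change (51)-(52) is elementary but easy to get wrong in the case $\lambda < 0$, where the new time variable lives on a bounded interval. The following sketch implements the two maps and the endpoint $S_\infty$; the function names are hypothetical, and the use of `expm1` and `log1p` is only a numerical convenience for small arguments.

```python
import math


def s_of_t(t, lam):
    """s(t) = (exp(2 lam t) - 1) / (2 lam), as in (51); requires lam != 0."""
    if lam == 0:
        raise ValueError("the rescaling is only introduced for lam != 0")
    return math.expm1(2.0 * lam * t) / (2.0 * lam)


def t_of_s(s, lam):
    """Inverse map t(s) = log(1 + 2 lam s) / (2 lam), for 0 <= s < S_infinity."""
    if lam == 0:
        raise ValueError("the rescaling is only introduced for lam != 0")
    return math.log1p(2.0 * lam * s) / (2.0 * lam)


def s_infinity(lam):
    """Right end point of the s-interval in (52)."""
    if lam == 0:
        raise ValueError("the rescaling is only introduced for lam != 0")
    return math.inf if lam > 0 else -1.0 / (2.0 * lam)
```

For example, with `lam = -0.8` every value `s_of_t(t, lam)` stays below `s_infinity(lam) = 0.625`, and `t_of_s` recovers `t`, so the whole half-line $t \ge 0$ is mapped onto $[0, S_\infty)$.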

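Proposition 2.6. A continuous family $\rho_t \in \mathscr{P}(\mathbb{R}^d)$ is a distributional solution of (3) if and only
if the rescaled measures $\sigma_s$ defined by (53) and (51) satisfy

\[ \int_0^{S_\infty}\!\!\int_{\mathbb{R}^d} \big(\partial_s \varphi + \Delta \varphi - \tilde A(\cdot, s) \cdot \nabla \varphi\big)\, d\sigma_s\, ds = 0 \quad \forall\, \varphi \in C_c^\infty(\mathbb{R}^d \times (0, S_\infty)). \tag{55} \]

If $\rho$ satisfies (15) then

\[ \int_{s_0}^{s_1}\!\!\int_{\mathbb{R}^d} |\tilde A(x, s)|\, d\sigma_s\, ds < +\infty \quad \text{for every } 0 < s_0 < s_1 < S_\infty, \tag{56} \]

and in this case $\sigma$ satisfies

\[ \int_{\mathbb{R}^d} \varphi(\cdot, s_1)\, d\sigma_{s_1} - \int_{\mathbb{R}^d} \varphi(\cdot, s_0)\, d\sigma_{s_0} = \int_{s_0}^{s_1}\!\!\int_{\mathbb{R}^d} \big(\partial_s \varphi + \Delta \varphi - \tilde A(y, s) \cdot \nabla_y \varphi\big)\, d\sigma_s\, ds \tag{57} \]

for every test function $\varphi \in C^{2,1}_b(\mathbb{R}^d \times [s_0, s_1])$ with bounded first and second derivatives.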

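Proof. We introduce the change of variable map $X(x,t) := (e^{\lambda t} x, s(t))$ and for a given smooth
function $\varphi \in C_c^\infty(\mathbb{R}^d \times (0, S_\infty))$ we set $\zeta(x,t) := \varphi(e^{\lambda t} x, s(t)) = \varphi \circ X$. Denoting by $(y, s) \in
\mathbb{R}^d \times [0, S_\infty)$ the new variables, easy calculations show that in $\mathbb{R}^d \times (0,+\infty)$ we have

\[ \partial_t \zeta = s'\,\big(\partial_s \varphi + \lambda e^{-2\lambda t}\, \nabla_y \varphi \cdot y\big) \circ X, \qquad \nabla_x \zeta = e^{\lambda t}\, \nabla_y \varphi \circ X, \]
\[ \Delta_x \zeta = e^{2\lambda t}\, \Delta_y \varphi \circ X, \qquad B \cdot \nabla_x \zeta = e^{2\lambda t}\, \big(\tilde B(y,s) \cdot \nabla_y \varphi\big) \circ X, \]

where we used the fact that $B = e^{\lambda t}\, \tilde B \circ X$. In particular we have

\[ \partial_t \zeta - B \cdot \nabla_x \zeta = s'\,\big(\partial_s \varphi - \tilde A(y,s) \cdot \nabla_y \varphi\big) \circ X. \]

We thus have

\[ \int_{\mathbb{R}^d} \big(\partial_t \zeta + \Delta_x \zeta - B \cdot \nabla_x \zeta\big)\, d\rho_t = s'(t) \int_{\mathbb{R}^d} \big(\partial_s \varphi + \Delta_y \varphi - \tilde A(y,s) \cdot \nabla_y \varphi\big) \circ X\, d\rho_t = s'(t) \int_{\mathbb{R}^d} \big(\partial_s \varphi + \Delta_y \varphi - \tilde A(y,s) \cdot \nabla_y \varphi\big)\, d\sigma_{s(t)}, \]

since $\sigma_{s(t)}(E) = \rho_t(e^{-\lambda t} E)$ for every Borel set $E \subset \mathbb{R}^d$. Eventually we obtain

\[ \int_0^{+\infty}\!\!\int_{\mathbb{R}^d} \big(\partial_t \zeta + \Delta_x \zeta - B \cdot \nabla_x \zeta\big)\, d\rho_t\, dt = \int_0^{S_\infty}\!\!\int_{\mathbb{R}^d} \big(\partial_s \varphi + \Delta_y \varphi - \tilde A(y,s) \cdot \nabla_y \varphi\big)\, d\sigma_s\, ds. \]

(56) follows by a simple application of the change of variable formula (23), since for every $s \in (0, S_\infty)$

\[ \int_{\mathbb{R}^d} |\tilde A(y,s)|\, d\sigma_s(y) \overset{(54)}{=} e^{-\lambda t(s)} \int_{\mathbb{R}^d} \big|A(e^{-\lambda t(s)} y)\big|\, d\sigma_s(y) \overset{(23)}{=} e^{-\lambda t(s)} \int_{\mathbb{R}^d} |A(x)|\, d\rho_{t(s)}(x) = e^{-\lambda t(s)} \int_{\mathbb{R}^d} |B(x) - \lambda x|\, d\rho_{t(s)}(x). \]

Since $t'(s) = 1/(1 + 2\lambda s) = e^{-2\lambda t(s)}$, the change of variable $t = t(s)$ gives for $t_i = t(s_i)$

\[ \int_{s_0}^{s_1}\!\!\int_{\mathbb{R}^d} |\tilde A(x, s)|\, d\sigma_s\, ds = \int_{t_0}^{t_1} e^{\lambda t} \int_{\mathbb{R}^d} |B(x) - \lambda x|\, d\rho_t(x)\, dt \le e^{|\lambda| t_1} \int_{t_0}^{t_1}\!\!\int_{\mathbb{R}^d} |B(x) - \lambda x|\, d\rho_t(x)\, dt \overset{(15)}{<} +\infty. \]

(57) follows from (55) when $\varphi$ belongs to $C_c^\infty(\mathbb{R}^d \times [s_0, s_1])$. If $\varphi \in C^{2,1}_b(\mathbb{R}^d \times [s_0, s_1])$, via a standard
convolution and truncation argument we find an approximating sequence $\varphi_k \in C_c^\infty(\mathbb{R}^d \times
[s_0, s_1])$ such that $\varphi_k, \partial_s \varphi_k, \nabla \varphi_k, \Delta \varphi_k$ remain uniformly bounded and converge pointwise to
$\varphi, \partial_s \varphi, \nabla \varphi, \Delta \varphi$ respectively. By (56) we can apply the Lebesgue Dominated Convergence theorem
to pass to the limit in (57) written for $\varphi_k$, thus obtaining the same identity for $\varphi$. □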

We conclude this section by a simple remark combining the regularization technique of Section 2.3
and the time rescaling (54).

Lemma 2.7. Let $A := B - \lambda I$ be a monotone operator, let us consider a sequence $A_{n,m}$, $n, m \in \mathbb{N}$,
of smooth monotone operators given by Theorem 2.4 and Proposition 2.5, and let us set

\[ \tilde A_{n,m}(y, s) := e^{-\lambda t(s)}\, A_{n,m}\big(e^{-\lambda t(s)} y\big), \qquad y \in \mathbb{R}^d,\ s \in [0, S_\infty), \tag{58} \]

defined as in (54), (51). Then $\tilde A_{n,m}$ are Lipschitz in $\mathbb{R}^d \times [0, S]$ for every $S \in [0, S_\infty)$.

Proof. We just have to check that $|\partial_s \tilde A_{n,m}(\cdot, s)|$ is uniformly bounded in $\mathbb{R}^d \times [0, S]$: since $t'(s) =
e^{-2\lambda t(s)}$, a simple calculation yields

\[ \partial_s \tilde A_{n,m}(y, s) = -\lambda\, t'(s)\, e^{-\lambda t(s)}\, Q_{n,m}\big(e^{-\lambda t(s)} y\big) = -\lambda\, e^{-3\lambda t(s)}\, Q_{n,m}\big(e^{-\lambda t(s)} y\big), \]

where

\[ Q_{n,m}(x) := A_{n,m}(x) + DA_{n,m}(x) \cdot x, \qquad x \in \mathbb{R}^d. \]

Since $e^{-\lambda t(s)}$ is uniformly bounded with all its derivatives in each compact interval $[0, S]$, $S < S_\infty$,
(47) shows that $Q_{n,m}$ is bounded and therefore $\tilde A_{n,m}$ is Lipschitz with respect to $s$. □

3 A comparison result for the backward equation 

In this section we give the proof of Theorem 1.5 in a slightly more general form, in order to be
applied to (a suitably regularized version of) the rescaled formulation considered in Proposition 2.6.
Let us suppose that $A : (y, s) \in \mathbb{R}^d \times [0, S_\infty) \to A(y, s) \in \mathbb{R}^d$ is a smooth vector field satisfying

\[ \sup_{\mathbb{R}^d \times [0,S]} |A| + |\partial_s A| + |DA| < +\infty \quad \text{for every } S \in [0, S_\infty), \tag{59} \]

\[ A(\cdot, s) \text{ is monotone for every } s \in [0, S_\infty). \tag{60} \]

We denote by $\mathcal{L}[\cdot]$ the differential operator defined by

\[ \mathcal{L}[\varphi](y, s) := \Delta_y \varphi(y, s) - A(y, s) \cdot \nabla_y \varphi(y, s), \qquad \varphi(\cdot, s) \in C^2(\mathbb{R}^d),\ (y, s) \in \mathbb{R}^d \times [0, S_\infty). \tag{61} \]

Thanks to (59) and (60), we can apply the existence result [20, Theorem 3.2.1] and for every
$S \in [0, S_\infty)$ and $\phi \in C_c^\infty(\mathbb{R}^d)$ we can find a solution $\varphi \in C^{2,1}_b(\mathbb{R}^d \times [0, S])$ of the backward
evolution equation

\[ \partial_s \varphi + \mathcal{L}[\varphi] = 0 \ \text{ in } \mathbb{R}^d \times [0, S], \qquad \varphi(\cdot, S) = \phi(\cdot). \tag{62} \]

We have

Theorem 3.1. Let $h : [0,+\infty) \to \mathbb{R}$ be a continuous and non-decreasing function. Let $\varphi^1, \varphi^2 \in
C^{2,1}_b(\mathbb{R}^d \times [0, S])$ be solutions of the "backward" inequality

\[ \partial_s \varphi + \mathcal{L}[\varphi] \ge 0 \ \text{ in } \mathbb{R}^d \times [0, S] \tag{63} \]

such that

\[ \varphi^1(y_1, S) + \varphi^2(y_2, S) \le h(|y_1 - y_2|) \quad \text{for every } y_1, y_2 \in \mathbb{R}^d. \]

Then

\[ \varphi^1(y_1, 0) + \varphi^2(y_2, 0) \le h(|y_1 - y_2|) \quad \text{for every } y_1, y_2 \in \mathbb{R}^d. \]

Proof. By approximating $h$ from above, it is not restrictive to assume that $h \in C^1[0,+\infty)$ with
$h'(0) = 0$; in particular the map $H(y_1, y_2) := h(|y_1 - y_2|)$ is of class $C^1$ in $\mathbb{R}^d \times \mathbb{R}^d$ and satisfies

\[ \nabla_{y_1} H(y_1, y_2) = -\nabla_{y_2} H(y_1, y_2) = g(y_1, y_2)\,(y_1 - y_2), \tag{64} \]

where

\[ g(y_1, y_2) = g(y_2, y_1) := \begin{cases} \dfrac{h'(|y_1 - y_2|)}{|y_1 - y_2|} & \text{if } y_1 \neq y_2, \\[1ex] 0 & \text{if } y_1 = y_2. \end{cases} \tag{65} \]

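The argument combines a variable-doubling technique and a classical variant of the maximum principle. Let us first show that if $\varphi^1,\varphi^2$ satisfy the strict inequality

\[ \partial_s\varphi^j + \mathscr{L}[\varphi^j] > 0 \quad\text{in } \mathbb{R}^d\times[0,S), \qquad j=1,2, \tag{66} \]

then the function

\[ f(y_1,y_2,s) := \varphi^1(y_1,s)+\varphi^2(y_2,s)-H(y_1,y_2) \]

cannot attain a (local) maximum at a point $(\bar y_1,\bar y_2,\bar s)$ with $\bar s<S$. We argue by contradiction and suppose that $(\bar y_1,\bar y_2,\bar s)$ is a local maximizer of $f$ with $\bar s<S$; we thus have

\[ \partial_s f(\bar y_1,\bar y_2,\bar s)\le 0, \qquad \nabla_{y_1}f(\bar y_1,\bar y_2,\bar s)=0, \qquad \nabla_{y_2}f(\bar y_1,\bar y_2,\bar s)=0; \]

so that

\[ \partial_s\varphi^1(\bar y_1,\bar s)+\partial_s\varphi^2(\bar y_2,\bar s)\le 0, \tag{67} \]

\[ \nabla_{y_1}\varphi^1(\bar y_1,\bar s) = \nabla_{y_1}H(\bar y_1,\bar y_2) = g(\bar y_1,\bar y_2)\,(\bar y_1-\bar y_2), \]

\[ \nabla_{y_2}\varphi^2(\bar y_2,\bar s) = \nabla_{y_2}H(\bar y_1,\bar y_2) = g(\bar y_1,\bar y_2)\,(\bar y_2-\bar y_1). \]

It follows that

\[ A(\bar y_1,\bar s)\cdot\nabla_{y_1}\varphi^1(\bar y_1,\bar s)+A(\bar y_2,\bar s)\cdot\nabla_{y_2}\varphi^2(\bar y_2,\bar s) = g(\bar y_1,\bar y_2)\,\big(A(\bar y_1,\bar s)-A(\bar y_2,\bar s)\big)\cdot(\bar y_1-\bar y_2) \ge 0. \tag{68} \]

On the other hand, since $H(\bar y_1+z,\bar y_2+z)=H(\bar y_1,\bar y_2)$, the function

\[ \mathbb{R}^d\ni z\mapsto \varphi^1(\bar y_1+z,\bar s)+\varphi^2(\bar y_2+z,\bar s)-H(\bar y_1,\bar y_2) = f(\bar y_1+z,\bar y_2+z,\bar s) \]

has a local maximum at $z=0$, so that

\[ \Delta_{y_1}\varphi^1(\bar y_1,\bar s)+\Delta_{y_2}\varphi^2(\bar y_2,\bar s)\le 0. \tag{69} \]

Combining (67), (68), and (69) we obtain

\[ \big(\partial_s\varphi^1+\mathscr{L}[\varphi^1]\big)(\bar y_1,\bar s)+\big(\partial_s\varphi^2+\mathscr{L}[\varphi^2]\big)(\bar y_2,\bar s)\le 0, \]

which contradicts (66).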
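
The contradiction can be read off by expanding $\mathscr{L}$ at the maximum point. The display below only regroups (67), (68), and (69); the bars on the maximizer $(\bar y_1,\bar y_2,\bar s)$ are notation adopted here, and each term is understood to be evaluated at its own point $(\bar y_j,\bar s)$.

```latex
% Regrouping of (67), (68), (69); \mathscr{L}[\varphi] = \Delta\varphi - A\cdot\nabla\varphi as in (61).
\begin{align*}
 &\big(\partial_s\varphi^1+\mathscr{L}[\varphi^1]\big)(\bar y_1,\bar s)
  +\big(\partial_s\varphi^2+\mathscr{L}[\varphi^2]\big)(\bar y_2,\bar s) \\
 &\qquad= \underbrace{\partial_s\varphi^1+\partial_s\varphi^2}_{\le 0 \text{ by (67)}}
   \;+\; \underbrace{\Delta_{y_1}\varphi^1+\Delta_{y_2}\varphi^2}_{\le 0 \text{ by (69)}}
   \;-\; \underbrace{\big(A\cdot\nabla_{y_1}\varphi^1 + A\cdot\nabla_{y_2}\varphi^2\big)}_{\ge 0 \text{ by (68)}}
   \;\le\; 0 .
\end{align*}
```

The sign in (68) uses both $g\ge 0$, which holds because $h$ is nondecreasing, and the monotonicity of $A(\cdot,s)$.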
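Suppose now that $\varphi^1,\varphi^2$ satisfy the inequality (63) and let us set for $\varepsilon,\delta>0$

\[ \varphi^j_{\varepsilon,\delta}(y_j,s) := \varphi^j(y_j,s)-\delta(S-s)-\varepsilon e^{-s}|y_j|^2, \qquad j=1,2. \]

We easily get

\[ \partial_s\varphi^j_{\varepsilon,\delta} = \partial_s\varphi^j+\delta+\varepsilon e^{-s}|y_j|^2, \]

\[ \mathscr{L}[\varphi^j_{\varepsilon,\delta}] = \mathscr{L}[\varphi^j]-\varepsilon e^{-s}\big(2d-2A(y_j,s)\cdot y_j\big), \]

\[ \partial_s\varphi^j_{\varepsilon,\delta}+\mathscr{L}[\varphi^j_{\varepsilon,\delta}] \ge \delta+\varepsilon e^{-s}\big(|y_j|^2-2d-2C|y_j|\big), \]

where $C := \sup_{y,s}|A(y,s)|<+\infty$.

It follows that for every $\delta>0$ there exists a coefficient $\varepsilon>0$ sufficiently small such that $\varphi^1_{\varepsilon,\delta},\varphi^2_{\varepsilon,\delta}$ satisfy (66). On the other hand, the continuous function

\[ (y_1,y_2,s)\mapsto f_{\varepsilon,\delta}(y_1,y_2,s) := \varphi^1_{\varepsilon,\delta}(y_1,s)+\varphi^2_{\varepsilon,\delta}(y_2,s)-h(|y_1-y_2|), \qquad y_1,y_2\in\mathbb{R}^d,\ s\in[0,S], \]

attains its maximum at some point $(\bar y_1,\bar y_2,\bar s)\in\mathbb{R}^d\times\mathbb{R}^d\times[0,S]$; by the previous argument, we conclude that $\bar s=S$ and therefore for every $y_1,y_2\in\mathbb{R}^d$

\[ \varphi^1_{\varepsilon,\delta}(y_1,0)+\varphi^2_{\varepsilon,\delta}(y_2,0)-h(|y_1-y_2|) \le f_{\varepsilon,\delta}(\bar y_1,\bar y_2,S) \le \varphi^1(\bar y_1,S)+\varphi^2(\bar y_2,S)-h(|\bar y_1-\bar y_2|) \le 0. \]

Passing to the limit as $\varepsilon,\delta\downarrow 0$ we conclude. □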
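
The smallness condition on the perturbation parameter can be made explicit. The following is a sketch under the assumptions that the perturbed functions are $\varphi^j-\delta(S-s)-\varepsilon e^{-s}|y_j|^2$, that $\mathscr{L}=\Delta-A\cdot\nabla$ as in (61), and that $C$ denotes a uniform bound for $|A|$; the explicit threshold $\delta/(C^2+2d)$ is derived here and is not stated in the text.

```latex
% Uses \Delta|y|^2 = 2d, \nabla|y|^2 = 2y, |A| \le C, and e^{-s} \le 1 for s \ge 0.
\begin{align*}
 \partial_s\varphi^j_{\varepsilon,\delta}+\mathscr{L}[\varphi^j_{\varepsilon,\delta}]
 &= \big(\partial_s\varphi^j+\mathscr{L}[\varphi^j]\big)
    +\delta+\varepsilon e^{-s}\big(|y_j|^2-2d+2A(y_j,s)\cdot y_j\big) \\
 &\ge \delta+\varepsilon e^{-s}\big((|y_j|-C)^2-C^2-2d\big)
  \;\ge\; \delta-\varepsilon\,(C^2+2d).
\end{align*}
% Hence the strict inequality (66) holds as soon as
\[
  0<\varepsilon<\frac{\delta}{C^2+2d}.
\]
```

The same penalization $-\varepsilon e^{-s}|y_j|^2$ is what forces the maximum to be attained: since $\varphi^1,\varphi^2$ are bounded and $-h(|y_1-y_2|)\le -h(0)$, the function being maximized tends to $-\infty$ as $|y_1|+|y_2|\to+\infty$.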

We conclude this section by recalling two well known estimates: 

Lemma 3.2 (Uniform estimates). Let $\varphi\in C^{2,1}_b(\mathbb{R}^d\times[0,S])\cap C^\infty(\mathbb{R}^d\times(0,S))$ be the solution of (62). Then

\[ \sup_{\mathbb{R}^d\times[0,S]}|\varphi|\le\sup_{\mathbb{R}^d}|\phi|, \qquad \sup_{\mathbb{R}^d\times[0,S]}|\nabla\varphi|\le\sup_{\mathbb{R}^d}|\nabla\phi|. \tag{70} \]

Proof. The first inequality is a direct application of the maximum principle (see e.g. [20, Theorem 3.1.1]). By differentiating the equation with respect to $y$ we obtain

\[ \partial_s D\varphi+\mathscr{L}[D\varphi]-DA\,D\varphi=0 \]

and then

\[ \tfrac12\,\partial_s|D\varphi|^2+\tfrac12\,\mathscr{L}\big[|D\varphi|^2\big]-DA\,D\varphi\cdot D\varphi-|D^2\varphi|^2=0. \]

Since $A$ is monotone, the quadratic form associated to $DA$ is nonnegative and therefore

\[ \partial_s|D\varphi|^2+\mathscr{L}\big[|D\varphi|^2\big]\ge 0. \]

A further application of the maximum principle yields (70). □
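
The passage from the differentiated equation to the identity for $|D\varphi|^2$ rests on the following elementary computations. This is a sketch, with $\mathscr{L}=\Delta-A\cdot\nabla$ acting componentwise on $D\varphi$ and $|D^2\varphi|^2$ denoting the sum of the squares of all second derivatives.

```latex
% Take the scalar product of the differentiated equation with D\varphi.
\begin{align*}
 D\varphi\cdot\partial_s D\varphi &= \tfrac12\,\partial_s|D\varphi|^2, \\
 D\varphi\cdot\Delta D\varphi &= \tfrac12\,\Delta|D\varphi|^2-|D^2\varphi|^2, \\
 D\varphi\cdot(A\cdot\nabla)D\varphi &= \tfrac12\,A\cdot\nabla|D\varphi|^2,
\end{align*}
% so that
\[
 D\varphi\cdot\mathscr{L}[D\varphi]
 = \tfrac12\,\mathscr{L}\big[|D\varphi|^2\big]-|D^2\varphi|^2,
 \qquad
 \partial_s|D\varphi|^2+\mathscr{L}\big[|D\varphi|^2\big]
 = 2\,DA\,D\varphi\cdot D\varphi+2\,|D^2\varphi|^2 \;\ge\; 0 .
\]
```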

4 Proof of Theorem 1.1 

We split the proof into several steps. Just to fix some notation, we consider a family $A_{n,m}$ of smooth, bounded, Lipschitz, and monotone operators approximating $A := B-\lambda I$ as in Proposition 2.5 and their rescaled versions $\tilde A_{n,m}$ defined by (58). $\mathscr{L}_{n,m}[\cdot]$ are the associated differential operators

\[ \mathscr{L}_{n,m}[\varphi](y,s) := \Delta_y\varphi(y,s)-\tilde A_{n,m}(y,s)\cdot\nabla_y\varphi(y,s), \qquad \varphi(\cdot,s)\in C^2(\mathbb{R}^d),\ (y,s)\in\mathbb{R}^d\times[0,S_\infty), \tag{71} \]

as in (61). Lemma 2.7 shows that these rescaled operators satisfy (59).

Step 1: reduction to the monotone case $\lambda=0$. When $\lambda\neq 0$ we apply the rescaling argument of section 2.4: we thus introduce the time rescaling $t(s)$ defined by (51) and the corresponding rescaled measures $\sigma^i_s$, obtained from $\rho^i_{t(s)}$ as in (53), which satisfy (56) and (57) for the rescaled operators $\tilde A$ of (54). Taking into account Remark 1.4 and the relation (53) between $\sigma^i_s$ and $\rho^i_{t(s)}$, the thesis follows if we show that

\[ \mathcal{C}_h(\sigma^1_s,\sigma^2_s)\le\mathcal{C}_h(\sigma^1_0,\sigma^2_0)\quad\text{for every } s\in[0,S_\infty) \tag{72} \]

(see (52) for the definition of $S_\infty$).
Step 2: If

\[ \mathcal{C}_h(\sigma^1_{s_1},\sigma^2_{s_1})\le\mathcal{C}_h(\sigma^1_{s_0},\sigma^2_{s_0})\quad\text{for every } 0<s_0<s_1<S_\infty, \tag{73} \]

then (72) holds. When $h$ is bounded, (73) implies (72) by taking a simple limit as $s_0\downarrow 0$ and using the fact that the map $(\sigma^1,\sigma^2)\mapsto\mathcal{C}_h(\sigma^1,\sigma^2)$ is continuous with respect to weak convergence in $\mathscr{P}(\mathbb{R}^d)\times\mathscr{P}(\mathbb{R}^d)$. If (72) holds for every bounded Lipschitz cost, then it holds for every continuous and nondecreasing cost by Lemma 2.2.
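Step 3: We claim the following:

Let $\phi^1,\phi^2\in C_c^\infty(\mathbb{R}^d)$ satisfy the constraint $\phi^1(y_1)+\phi^2(y_2)\le h(|y_1-y_2|)$. Then

\[ \int_{\mathbb{R}^d}\phi^1\,d\sigma^1_{s_1}+\int_{\mathbb{R}^d}\phi^2\,d\sigma^2_{s_1}\le\mathcal{C}_h(\sigma^1_{s_0},\sigma^2_{s_0})+\ell\,K_{n,m}, \tag{74} \]

where $\ell := \sup_{\mathbb{R}^d}|\nabla\phi^1|+\sup_{\mathbb{R}^d}|\nabla\phi^2|$ and

\[ K_{n,m} := \int_{s_0}^{s_1}\!\!\int_{\mathbb{R}^d}|\tilde A_{n,m}-\tilde A|\,d\sigma^1_s\,ds+\int_{s_0}^{s_1}\!\!\int_{\mathbb{R}^d}|\tilde A_{n,m}-\tilde A|\,d\sigma^2_s\,ds. \]

Indeed, applying [20, Theorem 3.2.1] we can introduce the solutions $\varphi^1_{n,m},\varphi^2_{n,m}\in C^{2,1}_b(\mathbb{R}^d\times[s_0,s_1])$ of the backward equations

\[ \partial_s\varphi^j_{n,m}+\mathscr{L}_{n,m}[\varphi^j_{n,m}]=0\quad\text{in } \mathbb{R}^d\times[s_0,s_1], \qquad \varphi^j_{n,m}(\cdot,s_1)=\phi^j\quad\text{in } \mathbb{R}^d. \]

Identity (57) shows that, for $j=1,2$,

\[ \int_{\mathbb{R}^d}\varphi^j_{n,m}(\cdot,s_1)\,d\sigma^j_{s_1}-\int_{\mathbb{R}^d}\varphi^j_{n,m}(\cdot,s_0)\,d\sigma^j_{s_0} = \int_{s_0}^{s_1}\!\!\int_{\mathbb{R}^d}(\tilde A_{n,m}-\tilde A)\cdot\nabla\varphi^j_{n,m}\,d\sigma^j_s\,ds \overset{(70)}{\le} \ell\int_{s_0}^{s_1}\!\!\int_{\mathbb{R}^d}|\tilde A_{n,m}-\tilde A|\,d\sigma^j_s\,ds. \]

Summing up these inequalities for $j=1,2$ we obtain

\[ \int_{\mathbb{R}^d}\phi^1\,d\sigma^1_{s_1}+\int_{\mathbb{R}^d}\phi^2\,d\sigma^2_{s_1}\le\int_{\mathbb{R}^d}\varphi^1_{n,m}(\cdot,s_0)\,d\sigma^1_{s_0}+\int_{\mathbb{R}^d}\varphi^2_{n,m}(\cdot,s_0)\,d\sigma^2_{s_0}+\ell\,K_{n,m}. \tag{75} \]

Theorem 3.1 yields $\varphi^1_{n,m}(y_1,s_0)+\varphi^2_{n,m}(y_2,s_0)\le h(|y_1-y_2|)$, which implies (74).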
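
The last implication is the easy half of Kantorovich duality. A minimal sketch, assuming that $\mathcal{C}_h(\mu^1,\mu^2)$ denotes the usual transport cost, that is the infimum of $\int h(|y_1-y_2|)\,d\gamma$ over all couplings $\gamma$ of $\mu^1$ and $\mu^2$ (this definition is not restated in this section):

```latex
% \gamma: any coupling of \sigma^1_{s_0} and \sigma^2_{s_0} (assumed notation).
\begin{align*}
 \int_{\mathbb{R}^d}\varphi^1_{n,m}(\cdot,s_0)\,d\sigma^1_{s_0}
 +\int_{\mathbb{R}^d}\varphi^2_{n,m}(\cdot,s_0)\,d\sigma^2_{s_0}
 &= \int_{\mathbb{R}^d\times\mathbb{R}^d}
    \big(\varphi^1_{n,m}(y_1,s_0)+\varphi^2_{n,m}(y_2,s_0)\big)\,d\gamma(y_1,y_2) \\
 &\le \int_{\mathbb{R}^d\times\mathbb{R}^d} h(|y_1-y_2|)\,d\gamma(y_1,y_2).
\end{align*}
```

Taking the infimum over $\gamma$ bounds the left-hand side by $\mathcal{C}_h(\sigma^1_{s_0},\sigma^2_{s_0})$, and inserting this bound in (75) gives (74).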
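Step 4:

\[ \limsup_{n\uparrow+\infty}\Big(\limsup_{m\uparrow+\infty}K_{n,m}\Big)=0. \tag{76} \]

Let us first notice that, setting $t_i := t(s_i)$ and recalling that $t'(s)=e^{-\lambda t(s)}$, we have

\[ \int_{s_0}^{s_1}\!\!\int_{\mathbb{R}^d}|\tilde A_{n,m}-\tilde A|\,d\sigma^j_s\,ds = \int_{s_0}^{s_1}t'(s)\int_{\mathbb{R}^d}|A_{n,m}-A|\,d\rho^j_{t(s)}\,ds = \int_{t_0}^{t_1}\!\!\int_{\mathbb{R}^d}|A_{n,m}-A|\,d\rho^j_t\,dt, \]

so that

\[ K_{n,m}=K^1_{n,m}+K^2_{n,m}, \qquad K^j_{n,m} := \int_{t_0}^{t_1}\!\!\int_{\mathbb{R}^d}|A_{n,m}-A|\,d\rho^j_t\,dt, \quad j=1,2. \]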
We can estimate $K^j_{n,m}$ by

\[ K^j_{n,m}\le\int_{t_0}^{t_1}\!\!\int_{\mathbb{R}^d}|A_{n,m}-A_n|\,d\rho^j_t\,dt+\int_{t_0}^{t_1}\!\!\int_{\mathbb{R}^d}|A_n-A|\,d\rho^j_t\,dt, \]

observing that by (47), (48), and the Lebesgue Dominated Convergence Theorem we get

\[ \lim_{m\uparrow+\infty}K^j_{n,m}=\int_{t_0}^{t_1}\!\!\int_{\mathbb{R}^d}|A_n-A|\,d\rho^j_t\,dt. \]

Since $|A_n(x)|\le|A^\circ(x)|\le|A(x)|=|B(x)-\lambda x|$ for every $x\in\mathbb{R}^d$, the integrability assumption (15), a further application of the Lebesgue Theorem, and (41) yield

\[ \lim_{n\uparrow+\infty}\Big(\lim_{m\uparrow+\infty}K^j_{n,m}\Big)=\int_{t_0}^{t_1}\!\!\int_{\mathbb{R}^d}|A^\circ-A|\,d\rho^j_t\,dt. \tag{77} \]

This last integrand is $0$ if $A$ coincides with the minimal selection $A^\circ$, in particular when $A$ is continuous. In the general case, the regularity result of [7] shows that $\rho^j_t\ll\mathscr{L}^d$ for $\mathscr{L}^1$-a.e. $t\in(0,+\infty)$ and (38) says that $A^\circ=A$ $\mathscr{L}^d$-a.e. in $\mathbb{R}^d$; therefore the last integral of (77) vanishes and we get (76).
Step 5: conclusion. 

Thanks to (76), passing to the limit in (74) we obtain 

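\[ \int_{\mathbb{R}^d}\phi^1\,d\sigma^1_{s_1}+\int_{\mathbb{R}^d}\phi^2\,d\sigma^2_{s_1}\le\mathcal{C}_h(\sigma^1_{s_0},\sigma^2_{s_0}). \]

Taking the supremum with respect to $\phi^1,\phi^2\in C_c^\infty(\mathbb{R}^d)$ and recalling Proposition 2.1 we obtain (73).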

Remark 4.1. As the final argument of Step 4 shows, when $A=B-\lambda I$ coincides with the minimal selection $A^\circ$ (in particular when $B$ is continuous), we do not need to invoke the regularity result of [7] to conclude our proof.

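Proof of Corollary 1.2. For (a), it is sufficient to observe that $e^{\lambda t}\ge 1$; this implies $h(r)\le h_{\lambda t}(r)$ and so

\[ \mathcal{C}_h(\rho^1_t,\rho^2_t)\le\mathcal{C}_{h_{\lambda t}}(\rho^1_t,\rho^2_t)\overset{(16)}{\le}\mathcal{C}_h(\rho^1_0,\rho^2_0). \]

Similarly, for (a) and (b)

\[ e^{p\lambda t}\,\mathcal{C}_h(\rho^1_t,\rho^2_t)\le\mathcal{C}_{h_{\lambda t}}(\rho^1_t,\rho^2_t)\overset{(16)}{\le}\mathcal{C}_h(\rho^1_0,\rho^2_0). \]

We conclude recalling that

\[ W_p(\rho^1,\rho^2)=\mathcal{C}_h(\rho^1,\rho^2)^{1/p}\quad\text{with } h(r)=|r|^p \]

and applying (a) and (b). □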
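
For the power cost the last step can be written out explicitly. The sketch below assumes that the rescaled cost is $h_{\lambda t}(r):=h(e^{\lambda t}r)$, a definition that is given elsewhere in the paper and is only presumed here, and that (16) is the inequality $\mathcal{C}_{h_{\lambda t}}(\rho^1_t,\rho^2_t)\le\mathcal{C}_h(\rho^1_0,\rho^2_0)$ used in the displays above.

```latex
% Assumption: h_{\lambda t}(r) = h(e^{\lambda t} r); here h(r) = |r|^p.
\[
  h_{\lambda t}(r) = e^{p\lambda t}\,|r|^p = e^{p\lambda t}\,h(r)
  \quad\Longrightarrow\quad
  \mathcal{C}_{h_{\lambda t}}(\rho^1_t,\rho^2_t)
  = e^{p\lambda t}\,W_p^p(\rho^1_t,\rho^2_t),
\]
\[
  e^{p\lambda t}\,W_p^p(\rho^1_t,\rho^2_t) \le W_p^p(\rho^1_0,\rho^2_0)
  \quad\Longrightarrow\quad
  W_p(\rho^1_t,\rho^2_t) \le e^{-\lambda t}\,W_p(\rho^1_0,\rho^2_0).
\]
```

Under these assumptions the estimate is an exponential contraction in $W_p$ when $\lambda>0$ and a controlled expansion when $\lambda<0$.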